DNA constructs encoding Bacillus thuringiensis toxins from strain A20

ABSTRACT

Recombinant DNA coding for an insecticidally active form of the Bacillus thuringiensis endotoxin, which DNA is derived from the chromosome or plasmids of the A20 strain of Bacillus thuringiensis. The recombinant genes may be of the 6.6, 5.3 or 4.5 types. Also disclosed is a truncated chimeric endotoxin-producing gene derived from the 5.3- and 4.5-type genes of strain A20.

This invention relates to recombinant DNA, and in particular to such DNA coding for Bacillus thuringiensis endotoxin.

The organism Bacillus thuringiensis carries genes encoding protein endotoxins that are insecticidal to a variety of agronomically important insects. They are not, however, harmful to many benign insects, or to earthworms, birds, fish or mammals. These endotoxins are thus useful as agricultural insecticides, particularly against Lepidopteran pests. Bacillus thuringiensis strains have been used as agricultural insecticides for many years.

Recently, Bacillus thuringiensis endotoxin genes have been engineered into dicotyledonous plants and shown to confer protection against insect attack. The Bacillus thuringiensis kurstaki strain, A20, is the subject of our prior unpublished copending UK application No 8730132 filed 24 Dec. 1987. Strain A20 was deposited at the National Collection of Industrial and Marine Bacteria (NCIMB) at Aberdeen, Scotland on 20 Oct. 1987 under the NCIMB Accession Number 12570. It is more active than the commonly used variety kurstaki strain, HD-1. Strain A20 carries three different, but highly related, endotoxin genes similar to those described in the scientific literature as 6.6-, 5.3-, and 4.5-type genes. The first two genes are carried in one or more copies on plasmids and also on the bacterial chromosome, while the 4.5-type gene is found only on the bacterial chromosome.

According to the present invention we provide recombinant DNA coding for an insecticidally active form of the Bacillus thuringiensis endotoxin which DNA is derived from the A20 strain of Bacillus thuringiensis. The source of the DNA may be either the chromosome or a plasmid.

In a further aspect, our invention comprises recombinant DNA coding for an insecticidally-active form of the Bacillus thuringiensis endotoxin which is derived from a 6.6-type endotoxin gene carried on a plasmid harboured by strain A20. The maximum molecular weight of such endotoxins is about 103,000 Daltons.

We have made such recombinant DNA comprising the first 2910 basepairs (970 amino acid codons) of the N-terminal coding region of a plasmid-derived 6.6-type endotoxin gene from Strain A20. The 6.6-type construct we have made encodes a fusion protein that includes 28 amino acid codons derived from the pUC19 vector DNA. The Bacillus thuringiensis-derived portion of the recombinant DNA has 10 basepair changes as compared with the analogous plasmid-derived 6.6-type gene from Bacillus thuringiensis strain HD73 (Adang et al., Gene 36, 1985, 289-300). Surprisingly, all of the base changes occur within the generally conserved N-terminal portion of the gene previously designated as a "highly-conserved" region (Whiteley and Schnepf, Ann. Rev. Microbiol., 40, 1986, pp549-576). Nine of the ten changes are clustered in a 0.35 kilobase segment which does not overlap with any of the 5 regions shown to be conserved in all types of endotoxin genes analysed to date (Sanchis et al., Molecular Microbiol., 3, 1989, pp229-238). Four of the 10 base changes result in amino acid substitutions.

The invention further comprises recombinant DNA coding for an insecticidally-active form of the Bacillus thuringiensis endotoxin which is derived from a 5.3-type endotoxin gene carried on a plasmid harboured by strain A20. The maximum molecular weight of the endotoxin is about 78,000 Daltons.

We have made such recombinant DNA comprising the first 2161 basepairs (696 amino acid codons) of the N-terminal coding region of a plasmid-derived 5.3-type endotoxin gene from Strain A20. This endotoxin gene differs from the analogous plasmid-derived 5.3-type gene from Bacillus thuringiensis strain HD-1 (Geiser et al., Gene 48, 1986, pp109-118) in that a deoxyguanosine base present in the published sequence at nucleotide 2024 is not present, resulting in a reading frame shift relative to the published structural gene. This frame shift results in a termination of the endotoxin gene product at amino acid residue 696, within the Bacillus thuringiensis-derived portion of the recombinant DNA. The recombinant DNA, however, encodes an insecticidally-active 5.3-type endotoxin protein.
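By way of illustration only, the effect of a single-base deletion on the reading frame may be sketched in Python as follows. The short sequence `TOY_GENE` and the helper names `translate` and `delete_base` are hypothetical and are not taken from the A20 or HD-1 genes; only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base and stop at the first stop codon.

    A trailing incomplete codon (one or two bases) is ignored.
    """
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(dna: str, position: int) -> str:
    """Return the sequence with the base at the 1-based position removed."""
    return dna[:position - 1] + dna[position:]


# Hypothetical toy gene, not an endotoxin sequence:
# ATG AAA GTG ATC CGT TTT TAA
TOY_GENE = "ATGAAAGTGATCCGTTTTTAA"

# Removing the G at position 7 shifts every downstream codon by one base.
SHIFTED_GENE = delete_base(TOY_GENE, 7)

full_length_product = translate(TOY_GENE)
truncated_product = translate(SHIFTED_GENE)
```

In this toy case the deletion brings a TGA codon into frame immediately after the second codon, so the translated product is shorter than that of the undeleted sequence. This is the same kind of effect as the termination at residue 696 described above, although the actual position of the new stop codon depends on the real sequence.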

In a further aspect, our invention comprises recombinant DNA coding for an insecticidally-active Bacillus thuringiensis endotoxin which is a chimera derived from sequences from at least two separate Bacillus thuringiensis genes. The molecular weight of the chimera may be of the order of 110,000 Daltons. Preferably the link or links between the sequences fall outside the hypervariable regions of such genes. In a more specific aspect, our invention comprises recombinant DNA coding for an insecticidally-active form of the Bacillus thuringiensis endotoxin comprising the first 1692 basepairs (564 amino acid codons) of the amino-terminal coding region from a 5.3-type endotoxin gene derived from Strain A20 and a restriction endonuclease Hind III-generated internal fragment of 1131 basepairs (377 amino acid codons) from a 4.5-type endotoxin gene derived from strain A20.

This chimeric gene encodes a novel fusion protein which also includes 28 amino acid codons derived from the pUC19 vector DNA. The Bacillus thuringiensis portion of the recombinant DNA has 97.1% DNA sequence homology with the 5.3-type gene from which it was largely derived. The internal Hind III-generated fragment used in the chimera and derived from the 4.5-type gene differs from the analogous Hind III fragment from the 5.3-type gene only by a 78 basepair insertion and 7 other physically separate base changes. These differences result in the in-frame addition of 26 amino acids due to the insertion, and in seven other amino acid changes; thus the novel protein has 91% amino acid homology with the 5.3-type gene product.
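The basepair and codon figures quoted above for the 6.6-type construct, the chimera and the 78 basepair insertion can be cross-checked with simple arithmetic. The following Python sketch is illustrative only; the function name `codons_in` and the dictionary `SEGMENTS_BP` are hypothetical labels, and the lengths are those stated in the text.

```python
def codons_in(basepairs: int) -> int:
    """Return the number of whole codons in a coding segment.

    Raises ValueError if the length is not a multiple of three, since such
    a segment could not be inserted or joined without shifting the frame.
    """
    if basepairs % 3 != 0:
        raise ValueError(f"{basepairs} bp is not a whole number of codons")
    return basepairs // 3


# Segment lengths in basepairs, as stated in the specification.
SEGMENTS_BP = {
    "6.6-type N-terminal coding region": 2910,
    "5.3-type amino-terminal region used in the chimera": 1692,
    "4.5-type internal Hind III fragment": 1131,
    "insertion within the 4.5-type fragment": 78,
}

codon_counts = {name: codons_in(bp) for name, bp in SEGMENTS_BP.items()}

# Codons in the chimera contributed by B. thuringiensis DNA
# (the 28 vector-derived codons are not counted here).
bt_codons_in_chimera = (
    codon_counts["5.3-type amino-terminal region used in the chimera"]
    + codon_counts["4.5-type internal Hind III fragment"]
)
```

Because 78 is a multiple of three, the insertion adds 26 whole codons and leaves the downstream reading frame unchanged, which is why it is described as an in-frame addition.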

Recombinant genes according to our invention encoding insecticidally-active endotoxins may be of varying lengths. When cloning DNA from the A20 strain chromosome, it is convenient to use bacteriophage λ vectors or other cloning vectors that sequester the recombinant DNA from host cell enzymes that might cause homologous recombination.

Three Escherichia coli strains containing the cloned 6.6- and 5.3-type endotoxin genes from A20 and the chimeric endotoxin gene have been deposited at the National Collection of Industrial and Marine Bacteria at Aberdeen, Scotland. These are:

    ______________________________________________________________________
    Strain designation      Endotoxin Protein          NCIMB Accession No.
    ______________________________________________________________________
    E. coli MC1022/pJH1     5.3-type fusion protein    40049
    E. coli MC1022/pJH2     Novel chimeric molecule    40050
    E. coli MC1022/pJH10    6.6-type fusion protein    40211
    ______________________________________________________________________

The first two deposits were made on 23 Sep. 1988 and the third on 15 Sep. 1989.

The invention may be further understood by reference to the drawings, in which:

FIGS. 1A, 1B and 1C show diagrammatically the structure of 6.6, 5.3 and chimeric-type genes, respectively, as cloned;

FIG. 2 illustrates schematically the construction of pJH1 (=pIC49) and pJH2 (=pIC47) from publicly available vector pUC19 and DNA obtained from A20 (NCIMB 12570);

FIGS. 3A1, 3A2, 3A3, 3B1, 3B2, 3B3, 4A1, 4A2, 4A3, 4B1, 4B2, 4B3, 5A1, 5A2, 5A3, 5B1, 5B2 and 5B3 show the base sequence, the amino acid sequence and the main restriction sites of the pJH10 endotoxin gene;

FIGS. 6A1, 6A2, 6A3, 6B1, 6B2, 6B3, 7A1, 7A2, 7A3, 7B1, 7B2, 7B3, 8A1, 8A2, 8A3, 8B1 and 8B2 show the base sequence, the amino acid sequence and the main restriction sites of the pJH1 endotoxin gene;

FIGS. 9A1, 9A2, 9B1, 9B2, 10A1, 10A2, 10B1, 10B2, 11A1, 11A2, 11B1, 11B2, 12A1 and 12A2 show the base sequence, the amino acid sequence and the main restriction sites of the chimeric pJH2 endotoxin gene;

FIG. 13 shows the characteristics of the pJH10 endotoxin gene product;

FIG. 14 shows the characteristics of the pJH1 endotoxin gene product;

FIG. 15 shows the characteristics of the pJH2 endotoxin gene product.

With further reference to FIG. 1, in this diagram H represents HindIII, D=DraI, X=XmnI, C=ClaI, E=EcoR1, R=EcoRV, P=PstI, B=BclI and K=KpnI. A dot shows a base change, while x indicates a base change and amino acid substitution in the 6.6-type gene of the invention as compared with the 6.6-type gene of strain HD73. The line under the 6.6-type gene indicates the 2.2 kilobase DraI fragment sequenced, while the line under the 5.3-type gene indicates the approximate length of the coding sequence. The heavy broken line indicates vector DNA. In FIG. 2, H is again HindIII, while N represents NdeI.

With reference to FIGS. 5A1, 5A2, 5A3, 5B1, 5B2 and 5B3, it should be noted that the sequence from nucleotide 2228 to nucleotide 2907 (internal DraI to HindIII) has been taken from the published literature. All other sequence data given for constructs of the invention have been obtained independently by us.
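Restriction sites such as those marked in the figures can be located in a base sequence by a plain text search for each enzyme's recognition sequence. The Python sketch below is illustrative only: the function names `find_sites` and `map_sites` and the fragment `TOY_FRAGMENT` are hypothetical, the recognition sequences are the standard published ones for the enzymes named above, and XmnI is omitted because its recognition sequence (GAANNNNTTC) contains unspecified bases that a literal search cannot match.

```python
# Standard recognition sequences for enzymes named in the figure legends.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "DraI": "TTTAAA",
    "ClaI": "ATCGAT",
    "EcoRI": "GAATTC",
    "EcoRV": "GATATC",
    "PstI": "CTGCAG",
    "BclI": "TGATCA",
    "KpnI": "GGTACC",
    "NdeI": "CATATG",
}


def find_sites(dna: str, site: str) -> list[int]:
    """Return the 1-based start position of every occurrence of a site.

    Overlapping occurrences are reported, because the search resumes one
    base after each hit rather than after the end of the hit.
    """
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start + 1)
        start = dna.find(site, start + 1)
    return positions


def map_sites(dna: str) -> dict[str, list[int]]:
    """Return a map of enzyme name to site positions, omitting non-cutters."""
    site_map = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = find_sites(dna.upper(), site)
        if positions:
            site_map[enzyme] = positions
    return site_map


# Hypothetical fragment containing one DraI site and one HindIII site.
TOY_FRAGMENT = "GGTTTAAACCAAGCTTGG"
toy_map = map_sites(TOY_FRAGMENT)
```

Positions are reported as the first base of the recognition sequence; whether a coordinate quoted in a figure refers to the first base, the last base or the cut position of a site is a convention that must be checked against the figure itself.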

In Heliothis virescens bioassays the chimeric gene product is of comparable potency to that of the 5.3-type gene over the normal bioassay time-course of six days.

The preferred method of producing recombinant DNA according to the invention is by culturing a sample of deposited strains NCIMB 40049, 40050 or 40211, obtainable from the National Collection of Industrial and Marine Bacteria, under suitable conditions in appropriate media. Such conditions and media are well-known to the art. Variant forms of DNA according to the invention may be made by cloning suitable DNA sequences from bacteria of strain A20 (also deposited at the NCIMB, under No 12570).

The invention further comprises insecticidal compositions comprising Bacillus thuringiensis endotoxin produced by expression of recombinant DNA according to the invention, as well as a process of combating insects which comprises exposing them to effective amounts of such endotoxin.

Insecticidal compositions according to the invention may be obtained from cultured cell suspensions by appropriate lysis techniques. Preparations of lysed cells (i.e. cell extracts) may be used directly, or concentrated by lyophilisation, followed by reconstitution.

The process of the invention is generally carried out by incorporation of the nonviable cell extracts into, or onto, the insect food source. Alternatively the novel insecticidal gene can be introduced into the genome of food plants normally attacked by Lepidopteran pests. For this purpose suitable plant regulatory DNA sequences are inserted adjacent to the recombinant DNA in operative relation therewith so as to allow expression of the endotoxin in the plant. Specific examples of commercially important plants to be protected in this manner are maize, cotton, and tobacco. Various plant transformation methods are known to the art, for example the use of Ti plasmid vectors (for dicots), protoplast regeneration, embryo microinjection and microprojectiles.

Insects which are combated by the invention include various Lepidopteran pests, for example, those in Table 1.

    TABLE 1
    ______________________________________________________
    COMMON NAME               LATIN NAME
    ______________________________________________________
    European Corn Borer       Ostrinia nubilalis
    Tobacco Budworm           Heliothis virescens
    Tobacco Hornworm          Manduca sexta
    Corn Earworm              Heliothis zea
    Beet Armyworm             Spodoptera exigua
    Fall Armyworm             Spodoptera frugiperda
    Diamondback Moth          Plutella xylostella
    Cabbage Looper            Trichoplusia ni
    Eastern Spruce Budworm    Choristoneura fumiferana
    Gypsy Moth                Lymantria dispar
    ______________________________________________________

The following Examples illustrate the invention.

EXAMPLE 1

Cloning of the plasmid-derived 6.6-type endotoxin gene carried on pJH10 and of chromosomally-derived endotoxin genes.

Covalently-closed circular (ccc) plasmid DNA was prepared from Bacillus thuringiensis Strain A20 (NCIMB 12570) by techniques well-known in the art. Large molecular weight (greater than 40 kilobases) plasmid ccc DNA was isolated by size fractionation on 10%-40% sucrose step gradients prior to digestion with restriction endonuclease Hind III and ligation into the Hind III-digested plasmid vector pUC19. The ligation reaction mixture was transformed into E. coli Strain MC1022, an ampicillin-sensitive strain with the genotype ara D139, Δ(ara, leu) 7697, Δ(lac Z) M15, gal U, gal K, str A. Ampicillin-resistant transformed colonies containing the 6.6 kilobase Hind III fragment were detected by hybridisation of lysed colonies fixed to nitrocellulose using a radioactively-labelled fragment of the Strain HD73 6.6-type endotoxin gene as a probe. Isolate pJH10 was further characterised by DNA hybridisation, restriction endonuclease mapping, and DNA sequence analysis techniques, all of which are well known in the art.

Endotoxin genes were cloned from chromosomal DNA prepared from Strain A20 (NCIMB 12570) as follows:

A 500 ml culture of Strain A20 was grown in L-broth at 37° C., with shaking, to an O.D.₆₀₀ of 1.00. Cells were harvested by centrifugation at 8000 revolutions per minute (rpm) for 10 minutes at 4° C., then re-suspended in 5 ml TES buffer (50 mM Tris-HCl pH 7.5, 50 mM NaCl, 5 mM EDTA). Cells were treated for 30 minutes at 37° C. with lysozyme (0.5 mg/ml final concentration) and RNase (0.1 mg/ml final concentration taken from a stock solution of 5 mg/ml boiled at 100° C. for 5 minutes prior to use). Lysis was completed by the addition of Sarcosyl to give a final concentration of 0.8% and incubation at 37° C. for 60 minutes in the presence of Pronase (0.5 mg/ml final concentration taken from a stock solution of 5 mg/ml pre-incubated at 37° C. for 60 minutes prior to use). Lysate volume was adjusted to 9.0 ml with 50 mM Tris-HCl pH 7.6, 10 mM EDTA, prior to the addition of 9.2 g caesium chloride (CsCl). After the CsCl dissolved, 1.25 ml of a 5 mg/ml solution of ethidium bromide was added prior to isopycnic centrifugation of the mixture at 40,000 rpm for 48 hours at 15° C.
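The volume of a stock solution needed to reach a stated final concentration follows from a mass balance. The Python sketch below is illustrative only; the function name `stock_volume_ml` is hypothetical, and it assumes, which the text does not state, that the RNase stock is added to exactly 5 ml of cell suspension and that no other addition changes the volume.

```python
def stock_volume_ml(initial_volume_ml: float,
                    stock_mg_per_ml: float,
                    final_mg_per_ml: float) -> float:
    """Volume of stock to add to reach a target final concentration.

    Mass balance: stock * v = final * (initial + v), which rearranges to
    v = final * initial / (stock - final).
    """
    if final_mg_per_ml >= stock_mg_per_ml:
        raise ValueError("final concentration must be below the stock concentration")
    return final_mg_per_ml * initial_volume_ml / (stock_mg_per_ml - final_mg_per_ml)


# Assumed starting volume: the 5 ml TES resuspension.
SUSPENSION_ML = 5.0

# RNase: 5 mg/ml stock, 0.1 mg/ml final concentration (figures from the text).
rnase_stock_ml = stock_volume_ml(SUSPENSION_ML, 5.0, 0.1)

# Concentration actually reached after adding that volume, as a check.
rnase_final = 5.0 * rnase_stock_ml / (SUSPENSION_ML + rnase_stock_ml)
```

Under these assumptions roughly 0.1 ml of RNase stock is required. The same function applies to the Pronase addition, provided the volume of the lysate at that step is known.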

After removal of CsCl and ethidium bromide by conventional techniques, an aliquot of purified chromosomal DNA was partially digested with the restriction endonuclease EcoR1 prior to ligation into EcoR1-digested bacteriophage λ EMBL4 vector DNA. Ligation reaction mixtures were packaged into viable phage particles using a commercially-available kit from Amersham International PLC. The resultant recombinant phage particles were selected by growth on E. coli host strain PE392, a P2 lysogen of strain LE392 which has the genotype hsd R514 (r_(k) ⁻, m_(k) ⁺), sup E44, sup F58, lac Y1 or Δ(lac IZY), gal K2, gal T22, met B1, trp55. Recombinant phage carrying one or more endotoxin genes were detected by hybridisation of lysed plaques fixed to nitrocellulose using a radiolabelled fragment of the Strain A20 6.6-type plasmid-derived endotoxin gene as a probe.

Plaques containing endotoxin genes were purified and characterised by restriction endonuclease mapping techniques well known in the art.

EXAMPLE 2

Construction of plasmid pJH2 containing the chimeric endotoxin gene.

DNA was prepared from plasmids pIC41 and pIC45, the derivation of which is shown in FIG. 2. pIC41 DNA was digested with restriction endonuclease HindIII and joined to the isolated, internal 1131 basepair HindIII-generated fragment from plasmid pIC45 using T4 DNA ligase. Methods for DNA preparation, DNA fragment purification and in vitro ligation reactions are all well known to the art. The ligation reaction mixture was introduced by transformation into E. coli strain MC1022, an F⁻ strain with the genotype ara D139, Δ(ara, leu) 7697, Δ(lacZ) M15, galU, galK, str A. Ampicillin-resistant transformed colonies containing the 1131 basepair fragment were detected by hybridisation of lysed colonies fixed to nitrocellulose filters using the radioactively labelled 1131 basepair fragment as a probe. DNA samples from colonies reacting to the probe were checked for the presence of the introduced 1131 basepair fragment in the correct orientation. Transformation of E. coli cells made competent by treatment with calcium chloride, colony blotting, nick-translation of DNA to prepare radioactive probes, DNA hybridisation techniques, and restriction endonuclease "mapping" techniques are all well known in the art.

pJH1 was made in a similar manner from pIC34 and pIC41, as shown in FIG. 2.

EXAMPLE 3

Preparation of insecticidal extracts according to the invention.

E. coli strain MC1022/pIC47 was grown in 400 ml Luria-broth supplemented with 50 μg Ampicillin per ml overnight at 37° C. with shaking. Cells were harvested by centrifugation at 8000 rpm for 10 minutes at 4° C. and resuspended in 26.4 ml lysing buffer (150 mM NaCl, 20 mM Tris pH 8, 5 mM EDTA, 7 mM β-mercaptoethanol). The resuspended cells were placed on ice and then 0.3 ml 0.1M phenylmethylsulphonylfluoride (PMSF) was added, followed by 3 ml of a freshly prepared solution of 10 mg lysozyme per ml, followed by a further addition of 0.3 ml PMSF. The mixture was left for 30 minutes on ice. The cell suspension was subjected to sonic disruption by treatment for 45 seconds with a one centimeter diameter sonicator probe operated at an amplitude of 12 microns on an MSE Soniprep 150 sonifier. The sonicated sample was then centrifuged at 11000 rpm for 20 minutes at 4° C. (using 50 cc capacity sterile centrifuge tubes). The supernatant was recovered by pouring off into a clean, sterile plastic container then dialyzed overnight at 4° C. in 20 mM phosphate buffer (mono- and di-basic) pH 7.2. Insecticidal extracts can be stored frozen, or lyophilised and stored frozen.
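The volumes given in this Example imply particular final concentrations of lysozyme and PMSF in the lysis mixture. The Python sketch below works these out for illustration only; the variable names are hypothetical, and the calculation assumes that the volume of the cell pellet itself is negligible, which the text does not state.

```python
# Volumes and stock concentrations as given in the Example.
LYSING_BUFFER_ML = 26.4
PMSF_ML_PER_ADDITION = 0.3
PMSF_ADDITIONS = 2
PMSF_STOCK_MM = 100.0          # 0.1 M expressed in millimolar
LYSOZYME_ML = 3.0
LYSOZYME_STOCK_MG_PER_ML = 10.0

# Total volume of the mixture (cell pellet volume neglected by assumption).
total_volume_ml = (
    LYSING_BUFFER_ML
    + PMSF_ADDITIONS * PMSF_ML_PER_ADDITION
    + LYSOZYME_ML
)

# Final concentration = amount added / total volume.
lysozyme_final_mg_per_ml = LYSOZYME_ML * LYSOZYME_STOCK_MG_PER_ML / total_volume_ml
pmsf_final_mm = PMSF_ADDITIONS * PMSF_ML_PER_ADDITION * PMSF_STOCK_MM / total_volume_ml
```

On these figures the mixture totals about 30 ml, with lysozyme at about 1 mg/ml and PMSF at about 2 mM after both additions.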

EXAMPLE 4

Efficacy of insecticidal extracts prepared from E. coli strains MC1022/pJH1 and MC1022/pJH2 in controlling Heliothis virescens and MC1022/pJH10 in controlling Heliothis virescens and Ostrinia nubilalis.

Insecticidal extract was added to molten insect diet (e.g. standard Heliothis diet) to give a final concentration of, for example, 1 ml per 10 g of diet (equivalent to approximately 510 ppm freeze-dried extract per treatment). 2.5 ml of treated diet was added to each of 10 plastic pots, covered and allowed to solidify. One first-instar larva was added to each pot and incubated at 27° C. for six days, at which time the results were recorded. In tests with lower treatment concentrations (e.g. 0.1 ml extract per 10 g of diet), surviving larvae at six days were left for longer periods of observation to allow documentation of altered stability. Results of typical experiments are shown in Tables 2 and 3.
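The dosing and scoring arithmetic of such a bioassay may be sketched in Python as follows. This is illustrative only; the function names `extract_volume_ml` and `percent_mortality` are hypothetical, and the larval counts in the usage lines are invented for the example and are not results from the tables.

```python
def extract_volume_ml(diet_g: float, ml_per_10_g: float = 1.0) -> float:
    """Volume of extract to add to a given mass of molten diet.

    The default rate is 1 ml per 10 g of diet; pass 0.1 for the lower
    treatment concentration mentioned in the text.
    """
    if diet_g < 0 or ml_per_10_g < 0:
        raise ValueError("diet mass and dose rate must be non-negative")
    return diet_g * ml_per_10_g / 10.0


def percent_mortality(dead: int, total: int) -> float:
    """Percentage of larvae found dead when the assay is scored."""
    if total <= 0 or not 0 <= dead <= total:
        raise ValueError("need 0 <= dead <= total and total > 0")
    return 100.0 * dead / total


# Hypothetical usage: 50 g of diet at the standard rate, and a treatment
# of ten pots (one larva each) in which nine larvae died.
standard_dose_ml = extract_volume_ml(50.0)
example_mortality = percent_mortality(dead=9, total=10)
```

With one larva per pot and ten pots per treatment, each dead larva contributes ten percentage points, so mortality for a single treatment can only be resolved in steps of 10%.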

EXAMPLE 5

Efficacy of insecticidal extracts prepared from strains MC1022/pJH2 and MC1022/pJH1 on Lepidopteran larvae.

Lyophilised extracts from E. coli strains MC1022/pJH2 and MC1022/pJH1 were reconstituted in distilled water to give a stock solution which was then added, in varying amounts, to molten diet as described above in Example 4. Results of a typical bioassay experiment on Plutella xylostella (Diamondback moth, DBM), Heliothis zea (Corn earworm, CEW) and Trichoplusia ni (Cabbage looper, CL) are given in Table 4.

The work described herein was all done in conformity with physical and biological containment requirements specified in the NIH and GMAC guidelines.

    TABLE 2
    ______________________________________________________________________________
    Bioassay Results 6 Days After Treatment
                                                 Inhibition of Development
                                                 (number of larvae per stage)
                                    Percent      Larval Instar:
    Construct            Plasmid    Mortality    1     2E    2L    3E    3L    4
    ______________________________________________________________________________
    Untreated controls   --         0            5     2     8     4     1     --
                                    5            --    --    3     4     4     5
    Vector control       pUC19      0            --    4     3     3     --    --
                                    0            --    2     --    4     2     2
    Positive control     pIC18*     100          --    --    --    --    --    --
                                    80           2     --    --    --    --    --
    5.3-type endotoxin   pJH1       90           1     --    --    --    --    --
    Chimeric endotoxin   pJH2       90           1     --    --    --    --    --
    ______________________________________________________________________________
    *carries an endotoxin gene from B. thuringiensis HD73
    E = early
    L = late

    TABLE 3
    ______________________________________________________________________________
    Heliothis virescens and Ostrinia nubilalis Bioassay Results
    Six Days after Treatment
                                    H. virescens                 O. nubilalis
                                    PERCENT     LARVAL           PERCENT     LARVAL
    GENE TYPE            PLASMID    MORTALITY   SIZE*            MORTALITY   SIZE*
    ______________________________________________________________________________
    Untreated Controls   --         0           2nd-4th instar   0           6.2 mm
    Vector Control       pUC19      0           2nd-3rd instar   5           5.7 mm
    Positive Control+    pIC18*     30          1st instar       50          2.0 mm
    6.6 Type Gene        pJH10      50          1st instar       40          2.0 mm
    ______________________________________________________________________________
    *Developmental stage or length of surviving larvae
    +Carries a 6.6-type endotoxin gene from B. thuringiensis strain HD73

    TABLE 4
    ______________________________________________________________________________
                                    % Mortality          % Stunting
    Construct            Plasmid    DBM    CEW    CL     DBM    CEW    CL
    ______________________________________________________________________________
    Untreated control    --         0      0      0      0      0      0
    Vector control       pUC19      0      0      0      0      0      0
    Positive control     pIC18      100    5      0      0      95     100
    5.3-type endotoxin   pJH1       95     0      0      5      100    45
    Chimeric endotoxin   pJH2       100    5      0      0      95     65
    ______________________________________________________________________________

DEPOSIT OF MICROORGANISMS

The specification refers to four microorganisms that have been deposited. These are shown below. All four were deposited at the National Collections of Industrial and Marine Bacteria Limited (NCIMB) of PO Box 31, 135 Abbey Road, Aberdeen, AB9 8DG, Scotland.

    ________________________________________________________________________
    Strain designation                     Date of Deposit   NCIMB Accession No.
    ________________________________________________________________________
    Bacillus thuringiensis kurstaki A20    20 Oct 1987       12570
    E. coli MC1022/pJH1                    23 Sep 1988       40049
    E. coli MC1022/pJH2                    23 Sep 1988       40050
    E. coli MC1022/pJH10                   15 Sep 1989       40211
    ________________________________________________________________________

The Bacillus thuringiensis strain A20 is fully described in our European Patent Publication no 325037, the disclosure of which is incorporated herein by reference. The three E. coli strains are of well known type, differing only in the plasmids they carry, full details of which are given elsewhere in the present specification.

    __________________________________________________________________________     SEQUENCE LISTING                                                               (1) GENERAL INFORMATION:                                                       (iii) NUMBER OF SEQUENCES: 6                                                   (2) INFORMATION FOR SEQ ID NO:1:                                               (i) SEQUENCE CHARACTERISTICS:                                                  (A) LENGTH: 2990 base pairs                                                    (B) TYPE: nucleic acid                                                         (C) STRANDEDNESS: single                                                       (D) TOPOLOGY: linear                                                           (ii) MOLECULE TYPE: DNA (genomic)                                              (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:                                        TATGTTT TAAATTGTAGTAATGAAAAACAGTATTATATCATAATGAATTGGTATCTTAAT60                AAAAGAGATGGAGGTAACTTATGGATAACAATCCGAACATCAATGAATGCATTCCTTATA120                ATTGTTTAAGTAACCCTGAAGTAGAAGTATTAGGTGGAGAAAGAATAGAAAC TGGTTACA180               CCCCAATCGATATTTCCTTGTCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGGTG240                CTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGAATTTTTGGTCCCTCTCAATGGG300                ACGCATTTCTTGTACAAATTGAACAGTTAA TTAACCAAAGAATAGAAGAATTCGCTAGGA360               ACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTATCAAATTTACGCAGAATCTT420                TTAGAGAGTGGGAAGCAGATCCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAAT480                TCAATGA CATGAACAGTGCCCTTACAACCGCTATTCCTCTTTTGGCAGTTCAAAATTATC540               AAGTTCCTCTTTTATCAGTATATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAGAG600                ATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGCGACTATCAATAG TCGTTATA660               ATGATTTAACTAGGCTTATTGGCAACTATACAGATTATGCTGTACGCTGGTACAATACGG720                GATTAGAACGTGTATGGGGACCGGATTCTAGAGATTGGGTAAGGTATAATCAATTTAGAA780                
GAGAATTAACACTAACTGTATTAGATATCGTTGCTCTGTTCCCGAATTATGATAGTAGAA 840
GATATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAATTTATACAAACCCAGTATTAG 900
AAAATTTTGATGGTAGTTTTCGAGGCTCGGCTCAGGGCATAGAAAGAAGTATTAGGAGTC 960
CACATTTGATGGATATACTTAACAGTATAACCATCTATACGGATGCTCATAGGGGTTATT 1020
ATTATTGGTCAGGGCATCAAATAATGGCTTCTCCTGTCGGGTTTTCGGGGCCAGAATTCA 1080
CGTTTCCGCTATATGGAACCATGGGAAATGCAGCTCCACAACAACGTATTGTTGCTCAAC 1140
TAGGTCAGGGCGTGTATAGAACATTATCCTCTACTTTTTATAGAAGACCTTTTAATATAG 1200
GGATAAATAATCAACAACTATCTGTTCTTGACGGGACAGAATTTGCTTATGGAACCTCCT 1260
CAAATTTGCCATCCGCTGTATACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATAC 1320
CACCACAGAATAACAACGTGCCACCTAGGCAAGGATTTAGTCATCGATTAAGCCATGTTT 1380
CAATGTTTCGTTCAGGCTCTAGTAGTAGTGTAAGTATAATAAGAGCTCCTATGTTCTCTT 1440
GGATACATCGTAGTGCTGAATTTAATAATATAATTGCATCGGATAGTATTACTCAAATCC 1500
CTGCAGTGAAGGGAAACTTTCTTTTTAATGGTTCTGTAATTTCAGGACCAGGATTTACTG 1560
GTGGGGACTTAGTTAGATTAAATAGTAGTGGAAATAACATTCAGAATAGAGGGTATATTG 1620
AAGTTCCAATTCACTTCCCATCGACATCTACCAGATATCGAGTTCGTGTACGGTATGCTT 1680
CTGTAACCCCGATTCACCTCAACGTTAATTGGGGTAATTCATCCATTTTTTCCAATACAG 1740
TACCAGCTACAGCTACGTCATTAGATAATCTACAATCAAGTGATTTTGGTTATTTTGAAA 1800
GTGCCAATGCTTTTACATCTTCATTAGGTAATATAGTAGGTGTTAGAAATTTTAGTGGGA 1860
CTGCAGGAGTGATAATAGACAGATTTGAATTTATTCCAGTTACTGCAACACTCGAGGCTG 1920
AATATAATCTGGAAAGAGCGCAGAAGGCGGTGAATGCGCTGTTTACGTCTACAAACCAAC 1980
TAGGGCTAAAAACAAATGTAACGGATTATCATATTGATCAAGTGTCCAATTTAGTTACGT 2040
ATTTATCGGATGAATTTTGTCTGGATGAAAAGCGAGAATTGTCCGAGAAAGTCAAACATG 2100
CGAAGCGACTCAGTGATGAACGCAATTTACTCCAAGATTCAAATTTCAAAGACATTAATA 2160
GGCAACCAGAACGTGGGTGGGGCGGAAGTACAGGGATTACCATCCAAGGAGGGGATGACG 2220
TATTTAAAGAAAATTACGTCACACTATCAGGTACCTTTGATGAGTGCTATCCAACATATT 2280
TGTATCAAAAAATCGATGAATCAAAATTAAAAGCCTTTACCCGTTATCAATTAAGAGGGT 2340
ATATCGAAGATAGTCAAGACTTAGAAATCTATTTAATTCGCTACAATGCAAAACATGAAA 2400
CAGTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCTTTCAGCCCAAAGTCCAATCGGAA 2460
AGTGTGGAGAGCCGAATCGATGCGCGCCACACCTTGAATGGAATCCTGACTTAGATTGTT 2520
CGTGTAGGGATGGAGAAAAGTGTGCCCATCATTCGCATCATTTCTCCTTAGACATTGATG 2580
TAGGATGTACAGACTTAAATGAGGACCTAGGTGTATGGGTGATCTTTAAGATTAAGACGC 2640
AAGATGGGCACGCAAGACTAGGGAATCTAGAGTTTCTCGAAGAGAAACCATTAGTAGGAG 2700
AAGCGCTAGCTCGTGTGAAAAGAGCGGAGAAAAAATGGAGAGACAAACGTGAAAAATTGG 2760
AATGGGAAACAAATATCGTTTATAAAGAGGCAAAAGAATCTGTAGATGCTTTATTTGTAA 2820
ACTCTCAATATGATCAATTACAAGCGGATACGAATATTGCCATGATTCATGCGGCAGATA 2880
AACGTGTTCATAGCATTCGAGAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTG 2940
AAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAA 2990
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2815 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
AAATTGTAGTAATGAAAAACAGTATTATATCATAATGAATTGGTATCTTAATAAAAGAGA 60
TGGAGGTAACTTATGGATAACAATCCGAACATCAATGAATGCATTCCTTATAATTGTTTA 120
AGTAACCCTGAAGTAGAAGTATTAGGTGGAGAAAGAATAGAAACTGGTTACACCCCAATC 180
GATATTTCCTTGTCGCTAACGCAATTTCTTTTGAGTGAATTTGTTCCCGGTGCTGGATTT 240
GTGTTAGGACTAGTTGATATAATATGGGGAATTTTTGGTCCCTCTCAATGGGACGCATTT 300
CTTGTACAAATTGAACAGTTAATTAACCAAAGAATAGAAGAATTCGCTAGGAACCAAGCC 360
ATTTCTAGATTAGAAGGACTAAGCAATCTTTATCAAATTTACGCAGAATCTTTTAGAGAG 420
TGGGAAGCAGATCCTACTAATCCAGCATTAAGAGAAGAGATGCGTATTCAATTCAATGAC 480
ATGAACAGTGCCCTTACAACCGCTATTCCTCTTTTTGCAGTTCAAAATTATCAAGTTCCT 540
CTTTTATCAGTATATGTTCAAGCTGCAAATTTACATTTATCAGTTTTGAGAGATGTTTCA 600
GTGTTTGGACAAAGGTGGGGATTTGATGCCGCGACTATCAATAGTCGTTATAATGATTTA 660
ACTAGGCTTATTGGCAACTATACAGATCATGCTGTACGCTGGTACAATACGGGATTAGAG 720
CGTGTATGGGGACCGGATTCTAGAGATTGGATAAGATATAATCAATTTAGAAGAGAATTA 780
ACACTAACTGTATTAGATATCGTTTCTCTATTTCCGAACTATGATAGTAGAACGTATCCA 840
ATTCGAACAGTTTCCCAATTAACAAGAGAAATTTATACAAACCCAGTATTAGAAAATTTT 900
GATGGTAGTTTTCGAGGCTCGGCTCAGGGCATAGAAGGAAGTATTAGGAGTCCACATTTG 960
ATGGATATACTTAACAGTATAACCATCTATACGGATGCTCATAGAGGAGAATATTATTGG 1020
TCAGGGCATCAAATAATGGCTTCTCCTGTAGGGTTTTCGGGGCCAGAATTCACTTTTCCG 1080
CTATATGGAACTATGGGAAATGCAGCTCCACAACAACGTATTGTTGCTCAACTAGGTCAG 1140
GGCGTGTATAGAACATTATCGTCCACTTTATATAGAAGACCTTTTAATATAGGGATAAAT 1200
AATCAACAACTATCTGTTCTTGACGGGACAGAATTTGCTTATGGAACCTCCTCAAATTTG 1260
CCATCCGCTGTATACAGAAAAAGCGGAACGGTAGATTCGCTGGATGAAATACCGCCACAG 1320
AATAACAACGTGCCACCTAGGCAAGGATTTAGTCATCGATTAAGCCATGTTTCAATGTTT 1380
CGTTCAGGCTTTAGTAATAGTAGTGTAAGTATAATAAGAGCTCCTATGTTCTCTTGGATA 1440
CATCGTAGTGCTGAATTTAATAATATAATTCCTTCATCACAAATTACACAAATACCTTTA 1500
ACAAAATCTACTAATCTTGGCTCTGGAACTTCTGTCGTTAAAGGACCAGGATTTACAGGA 1560
GGAGATATTCTTCGAAGAACTTCACCTGGCCAGATTTCAACCTTAAGAGTAAATATTACT 1620
GCACCATTATCACAAAGATATCGGGTAAGAATTCGCTACGCTTCTACCACAAATTTACAA 1680
TTCCATACATCAATTGACGGAAGACCTATTAATCAGGGGAATTTTTCAGCAACTATGAGT 1740
AGTGGGAGTAATTTACAGTCCGGAAGCTTTAGGACTGTAGGTTTTACTACTCCGTTTAAC 1800
TTTTCAAATGGATCAAGTGTATTTACGTTAAGTGCTCATGTCTTCAATTCAGGCAATGAA 1860
GTTTATATAGATCGAATTGAATTTGTTCCGGCAGAAGTAACCTTTGAGGCAGAATATGAT 1920
TTAGAAAGAGCACAAAAGGCGGTGAATGAGCTGTTTACTTCTTCCAATCAAATCGGGTTA 1980
AAAACAGATGTGACGGATTATCATATTGATCAAGTATCCAATTTAGTTGAGTGTTTATCT 2040
GATGAATTTTGTCTGGATGAAAAAAAAGAATTGTCCGAGAAAGTCAAACATGCGAACGAC 2100
TTAGTGATGAGCGGAATTTACTTCAAGATCCAAACTTTAGAGGGATCAATAGACAACTAG 2160
ACCGTGGCTGGAGAGGAAGTACGGATATTACCATCCAAGGAGGCGATGACGTATTCAAAG 2220
AGAATTACGTTACGCTATTGGGTACCTTTGATGAGTGCTATCCAACGTATTTATATCAAA 2280
AAATAGATGAGTCGAAATTAAAAGCCTATACCCGTTACCAATTAAGAGGGTATATCGAAG 2340
ATAGTCAAGACTTAGAAATCTATTTAATTCGCTACAATGCCAAACACGAAACAGTAAATG 2400
TGCCAGGTACGGGTTCCTTATGGCCGCTTTCAGCCCCAAGTCCAATCGGAAAATGTGCCC 2460
ATCATTCCCATCATTTCTCCTTGGACATTGATGTTGGATGTACAGACTTAAATGAGGACT 2520
TAGGTGTATGGGTGATATTCAAGATTAAGACGCAAGATGGCCATGCAAGACTAGGAAATC 2580
TAGAATTTCTCGAAGAGAAACCATTAGTAGGAGAAGCACTAGCTCGTGTGAAAAGAGCGG 2640
AGAAAAAATGGAGAGACAAACGTGAAAAATTGGAATGGGAAACAAATATTGTTTATAAAG 2700
AGGCAAAAGAATCTGTAGATGCTTTATTTGTAAACTCTCAATATGATAGATTACAAGCGG 2760
ATACCAACATCGCGATGATTCATGCGGCAGATAAACGCGTTCATAGCATTCGAGA 2815
(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3066 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GTTAACACCCTGGGTCAAAAATTGATATTTAGTAAAATTAGTTGCACTTTGTGCATTTTT 60
TCATAAGATGAGTCATATGTTTTAAATTGTAGTAATGAAAAACAGTATTATATCATAATG 120
AATTGGTATCTTAATAAAAGAGATGGAGGTAACTTATGGATAACAATCCGAACATCAATG 180
AATGCATTCCTTATAATTGTTTAAGTAACCCTGAAGTAGAAGTATTAGGTGGAGAAAGAA 240
TAGAAACTGGTTACACCCCAATCGATATTTCCTTGTCGCTAACGCAATTTCTTTTGAGTG 300
AATTTGTTCCCGGTGCTGGATTTGTGTTAGGACTAGTTGATATAATATGGGGAATTTTTG 360
GTCCCTCTCAATGGGACGCATTTCTTGTACAAATTGAACAGTTAATTAACCAAAGAATAG 420
AAGAATTCGCTAGGAACCAAGCCATTTCTAGATTAGAAGGACTAAGCAATCTTTATCAAA 480
TTTACGCAGAATCTTTTAGAGAGTGGGAAGCAGATCCTACTAATCCAGCATTAAGAGAAG 540
AGATGCGTATTCAATTCAATGACATGAACAGTGCCCTTACAACCGCTATTCCTCTTTTTG 600
CAGTTCAAAATTATCAAGTTCCTCTTTTATCAGTATATGTTCAAGCTGCAAATTTACATT 660
TATCAGTTTTGAGAGATGTTTCAGTGTTTGGACAAAGGTGGGGATTTGATGCCGCGACTA 720
TCAATAGTCGTTATAATGATTTAACTAGGCTTATTGGCAACTATACAGATCATGCTGTAC 780
GCTGGTACAATACGGGATTAGAGCGTGTATGGGGACCGGATTCTAGAGATTGGATAAGAT 840
ATAATCAATTTAGAAGAGAATTAACACTAACTGTATTAGATATCGTTTCTCTATTTCCGA 900
ACTATGATAGTAGAACGTATCCAATTCGAACAGTTTCCCAATTAACAAGAGAAATTTATA 960
CAAACCCAGTATTAGAAAATTTTGATGGTAGTTTTCGAGGCTCGGCTCAGGGCATAGAAG 1020
GAAGTATTAGGAGTCCACATTTGATGGATATACTTAACAGTATAACCATCTATACGGATG 1080
CTCATAGAGGAGAATATTATTGGTCAGGGCATCAAATAATGGCTTCTCCTGTAGGGTTTT 1140
CGGGGCCAGAATTCACTTTTCCGCTATATGGAACTATGGGAAATGCAGCTCCACAACAAC 1200
GTATTGTTGCTCAACTAGGTCAGGGCGTGTATAGAACATTATCGTCCACTTTATATAGAA 1260
GACCTTTTAATATAGGGATAAATAATCAACAACTATCTGTTCTTGACGGGACAGAATTTG 1320
CTTATGGAACCTCCTCAAATTTGCCATCCGCTGTATACAGAAAAAGCGGAACGGTAGATT 1380
CGCTGGATGAAATACCGCCACAGAATAACAACGTGCCACCTAGGCAAGGATTTAGTCATC 1440
GATTAAGCCATGTTTCAATGTTTCGTTCAGGCTTTAGTAATAGTAGTGTAAGTATAATAA 1500
GAGCTCCTATGTTCTCTTGGATACATCGTAGTGCTGAATTTAATAATATAATTCCTTCAT 1560
CACAAATTACACAAATACCTTTAACAAAATCTACTAATCTTGGCTCTGGAACTTCTGTCG 1620
TTAAAGGACCAGGATTTACAGGAGGAGATATTCTTCGAAGAACTTCACCTGGCCAGATTT 1680
CAACCTTAAGAGTAAATATTACTGCACCATTATCACAAAGATATCGGGTAAGAATTCGCT 1740
ACGCTTCTACCACAAATTTACAATTCCATACATCAATTGACGGAAGACCTATTAATCAGG 1800
GGAATTTTTCAGCAACTATGAGTAGTGGGAGTAATTTACAGTCCGGAAGCTTTAGGACTG 1860
TAGGTTTTACTACTCCGTTTAACTTTTCAAATGGATCAAGTGTATTTACGTTAAGTGCTC 1920
ATGTCTTCAATTCAGGCAATGAAGTTTATATAGATCGAATTGAATTTGTTCCGGCAGAAG 1980
TAACCTTTGAGGCAGAATATGATTTAGAAAGAGCACAAAAGGCGGTGAATGAGCTGTTTA 2040
CTTCTTCCAATCAAATCGGGTTAAAAACAGATGTGACGGATTATCATATTGATCAAGTAT 2100
CCAATTTAGTTGAGTGTTTATCAGATGAATTTTGTCTGGATGAAAAACAAGAATTGTCCG 2160
AGAAAGTCAAACATGCGAAGCGACTTAGTGATGAGCGGAATTTACTTCAAGATCCAAACT 2220
TCAGAGGGATCAATAGACAACTAGACCGTGGCTGGAGAGGAAGTACGGATATTACCATCC 2280
AAGGAGGCGATGACGTATTCAAAGAGAATTACGTTACGCTATTGGGTACCTTTGATGAGT 2340
GCTATCCAACGTATTTATATCAAAAAATAGATGAGTCGAAATTAAAAGCCTATACCCGTT 2400
ATCAATTAAGAGGGTATATCGAAGATAGTCAAGACTTAGAAATCTATTTAATTCGCTACA 2460
ATGCAAAACATGAAACAGTAAATGTGCCAGGTACGGGTTCCTTATGGCCGCTTTCAGCCC 2520
AAAGTCCAATCGGAAAGTGTGGAGAGCCGAATCGATGCGCGCCACACCTTGAGTGGAATC 2580
CTGACTTAGATTGTTCGTGTAGGGATGGAGAAAAGTGTGCCCATCATTCGCATCATTTCT 2640
CCTTAGACATTGATGTAGGATGTACAGACTTAAATGAGGACCTAGGTGTATGGGTGATCT 2700
TTAAGATTAAGACGCAAGATGGGCACGCAAGACTAGGGAATCTAGAGTTTCTCGAAGAGA 2760
AACCATTAGTAGGAGAAGCGCTAGCTCGTGTGAAAAGAGCGGAGAAAAAATGGAGAGACA 2820
AACGTGAAAAATTGGAATGGGAAACAAATATCGTTTATAAAGAGGCAAAAGAATCTGTAG 2880
ATGCTTTATTTGTAAACTCTCAATATGATCAATTACAAGCGGATACGAATATTGCCATGA 2940
TTCATGCGGCAGATAAACGTGTTCATAGCATTCGAGAAGCTTGGCGTAATCATGGTCATA 3000
CCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAG 3060
CATAAA 3066
(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 969 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
MetAspAsnAsnProAsnIleAsnGluCysIleProTyrAsnCysLeu
1 5 10 15
SerAsnProGluValGluValLeuGlyGlyGluArgIleGluThrGly
20 25 30
TyrThrProIleAspIleSerLeuSerLeuThrGlnPheLeuLeuSer
35 40 45
GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIleIle
50 55 60
TrpGlyIlePheGlyProSerGlnTrpAspAlaPheLeuValGlnIle
65 70 75 80
GluGlnLeuIleAsnGlnArgIleGluGluPheAlaArgAsnGlnAla
85 90 95
IleSerArgLeuGluGlyLeuSerAsnLeuTyrGlnIleTyrAlaGlu
100 105 110
SerPheArgGluTrpGluAlaAspProThrAsnProAlaLeuArgGlu
115 120 125
GluMetArgIleGlnPheAsnAspMetAsnSerAlaLeuThrThrAla
130 135 140
IleProLeuLeuAlaValGlnAsnTyrGlnValProLeuLeuSerVal
145 150 155 160
TyrValGlnAlaAlaAsnLeuHisLeuSerValLeuArgAspValSer
165 170 175
ValPheGlyGlnArgTrpGlyPheAspAlaAlaThrIleAsnSerArg
180 185 190
TyrAsnAspLeuThrArgLeuIleGlyAsnTyrThrAspTyrAlaVal
195 200 205
ArgTrpTyrAsnThrGlyLeuGluArgValTrpGlyProAspSerArg
210 215 220
AspTrpValArgTyrAsnGlnPheArgArgGluLeuThrLeuThrVal
225 230 235 240
LeuAspIleValAlaLeuPheProAsnTyrAspSerArgArgTyrPro
245 250 255
IleArgThrValSerGlnLeuThrArgGluIleTyrThrAsnProVal
260 265 270
LeuGluAsnPheAspGlySerPheArgGlySerAlaGlnGlyIleGlu
275 280 285
ArgSerIleArgSerProHisLeuMetAspIleLeuAsnSerIleThr
290 295 300
IleTyrThrAspAlaHisArgGlyTyrTyrTyrTrpSerGlyHisGln
305 310 315 320
IleMetAlaSerProValGlyPheSerGlyProGluPheThrPhePro
325 330 335
LeuTyrGlyThrMetGlyAsnAlaAlaProGlnGlnArgIleValAla
340 345 350
GlnLeuGlyGlnGlyValTyrArgThrLeuSerSerThrPheTyrArg
355 360 365
ArgProPheAsnIleGlyIleAsnAsnGlnGlnLeuSerValLeuAsp
370 375 380
GlyThrGluPheAlaTyrGlyThrSerSerAsnLeuProSerAlaVal
385 390 395 400
TyrArgLysSerGlyThrValAspSerLeuAspGluIleProProGln
405 410 415
AsnAsnAsnValProProArgGlnGlyPheSerHisArgLeuSerHis
420 425 430
ValSerMetPheArgSerGlySerSerSerSerValSerIleIleArg
435 440 445
AlaProMetPheSerTrpIleHisArgSerAlaGluPheAsnAsnIle
450 455 460
IleAlaSerAspSerIleThrGlnIleProAlaValLysGlyAsnPhe
465 470 475 480
LeuPheAsnGlySerValIleSerGlyProGlyPheThrGlyGlyAsp
485 490 495
LeuValArgLeuAsnSerSerGlyAsnAsnIleGlnAsnArgGlyTyr
500 505 510
IleGluValProIleHisPheProSerThrSerThrArgTyrArgVal
515 520 525
ArgValArgTyrAlaSerValThrProIleHisLeuAsnValAsnTrp
530 535 540
GlyAsnSerSerIlePheSerAsnThrValProAlaThrAlaThrSer
545 550 555 560
LeuAspAsnLeuGlnSerSerAspPheGlyTyrPheGluSerAlaAsn
565 570 575
AlaPheThrSerSerLeuGlyAsnIleValGlyValArgAsnPheSer
580 585 590
GlyThrAlaGlyValIleIleAspArgPheGluPheIleProValThr
595 600 605
AlaThrLeuGluAlaGluTyrAsnLeuGluArgAlaGlnLysAlaVal
610 615 620
AsnAlaLeuPheThrSerThrAsnGlnLeuGlyLeuLysThrAsnVal
625 630 635 640
ThrAspTyrHisIleAspGlnValSerAsnLeuValThrTyrLeuSer
645 650 655
AspGluPheCysLeuAspGluLysArgGluLeuSerGluLysValLys
660 665 670
HisAlaLysArgLeuSerAspGluArgAsnLeuLeuGlnAspSerAsn
675 680 685
PheLysAspIleAsnArgGlnProGluArgGlyTrpGlyGlySerThr
690 695 700
GlyIleThrIleGlnGlyGlyAspAspValPheLysGluAsnTyrVal
705 710 715 720
ThrLeuSerGlyThrPheAspGluCysTyrProThrTyrLeuTyrGln
725 730 735
LysIleAspGluSerLysLeuLysAlaPheThrArgTyrGlnLeuArg
740 745 750
GlyTyrIleGluAspSerGlnAspLeuGluIleTyrLeuIleArgTyr
755 760 765
AsnAlaLysHisGluThrValAsnValProGlyThrGlySerLeuTrp
770 775 780
ProLeuSerAlaGlnSerProIleGlyLysCysGlyGluProAsnArg
785 790 795 800
CysAlaProHisLeuGluTrpAsnProAspLeuAspCysSerCysArg
805 810 815
AspGlyGluLysCysAlaHisHisSerHisHisPheSerLeuAspIle
820 825 830
AspValGlyCysThrAspLeuAsnGluAspLeuGlyValTrpValIle
835 840 845
PheLysIleLysThrGlnAspGlyHisAlaArgLeuGlyAsnLeuGlu
850 855 860
PheLeuGluGluLysProLeuValGlyGluAlaLeuAlaArgValLys
865 870 875 880
ArgAlaGluLysLysTrpArgAspLysArgGluLysLeuGluTrpGlu
885 890 895
ThrAsnIleValTyrLysGluAlaLysGluSerValAspAlaLeuPhe
900 905 910
ValAsnSerGlnTyrAspGlnLeuGlnAlaAspThrAsnIleAlaMet
915 920 925
IleHisAlaAlaAspLysArgValHisSerIleArgGluAlaTrpArg
930 935 940
AsnHisGlyHisSerCysPheLeuCysGluIleValIleArgSerGln
945 950 955 960
PheHisThrThrTyrGluProGluAla
965
(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 695 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
MetAspAsnAsnProAsnIleAsnGluCysIleProTyrAsnCysLeu
1 5 10 15
SerAsnProGluValGluValLeuGlyGlyGluArgIleGluThrGly
20 25 30
TyrThrProIleAspIleSerLeuSerLeuThrGlnPheLeuLeuSer
35 40 45
GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIleIle
50 55 60
TrpGlyIlePheGlyProSerGlnTrpAspAlaPheLeuValGlnIle
65 70 75 80
GluGlnLeuIleAsnGlnArgIleGluGluPheAlaArgAsnGlnAla
85 90 95
IleSerArgLeuGluGlyLeuSerAsnLeuTyrGlnIleTyrAlaGlu
100 105 110
SerPheArgGluTrpGluAlaAspProThrAsnProAlaLeuArgGlu
115 120 125
GluMetArgIleGlnPheAsnAspMetAsnSerAlaLeuThrThrAla
130 135 140
IleProLeuPheAlaValGlnAsnTyrGlnValProLeuLeuSerVal
145 150 155 160
TyrValGlnAlaAlaAsnLeuHisLeuSerValLeuArgAspValSer
165 170 175
ValPheGlyGlnArgTrpGlyPheAspAlaAlaThrIleAsnSerArg
180 185 190
TyrAsnAspLeuThrArgLeuIleGlyAsnTyrThrAspHisAlaVal
195 200 205
ArgTrpTyrAsnThrGlyLeuGluArgValTrpGlyProAspSerArg
210 215 220
AspTrpIleArgTyrAsnGlnPheArgArgGluLeuThrLeuThrVal
225 230 235 240
LeuAspIleValSerLeuPheProAsnTyrAspSerArgThrTyrPro
245 250 255
IleArgThrValSerGlnLeuThrArgGluIleTyrThrAsnProVal
260 265 270
LeuGluAsnPheAspGlySerPheArgGlySerAlaGlnGlyIleGlu
275 280 285
GlySerIleArgSerProHisLeuMetAspIleLeuAsnSerIleThr
290 295 300
IleTyrThrAspAlaHisArgGlyGluTyrTyrTrpSerGlyHisGln
305 310 315 320
IleMetAlaSerProValGlyPheSerGlyProGluPheThrPhePro
325 330 335
LeuTyrGlyThrMetGlyAsnAlaAlaProGlnGlnArgIleValAla
340 345 350
GlnLeuGlyGlnGlyValTyrArgThrLeuSerSerThrLeuTyrArg
355 360 365
ArgProPheAsnIleGlyIleAsnAsnGlnGlnLeuSerValLeuAsp
370 375 380
GlyThrGluPheAlaTyrGlyThrSerSerAsnLeuProSerAlaVal
385 390 395 400
TyrArgLysSerGlyThrValAspSerLeuAspGluIleProProGln
405 410 415
AsnAsnAsnValProProArgGlnGlyPheSerHisArgLeuSerHis
420 425 430
ValSerMetPheArgSerGlyPheSerAsnSerSerValSerIleIle
435 440 445
ArgAlaProMetPheSerTrpIleHisArgSerAlaGluPheAsnAsn
450 455 460
IleIleProSerSerGlnIleThrGlnIleProLeuThrLysSerThr
465 470 475 480
AsnLeuGlySerGlyThrSerValValLysGlyProGlyPheThrGly
485 490 495
GlyAspIleLeuArgArgThrSerProGlyGlnIleSerThrLeuArg
500 505 510
ValAsnIleThrAlaProLeuSerGlnArgTyrArgValArgIleArg
515 520 525
TyrAlaSerThrThrAsnLeuGlnPheHisThrSerIleAspGlyArg
530 535 540
ProIleAsnGlnGlyAsnPheSerAlaThrMetSerSerGlySerAsn
545 550 555 560
LeuGlnSerGlySerPheArgThrValGlyPheThrThrProPheAsn
565 570 575
PheSerAsnGlySerSerValPheThrLeuSerAlaHisValPheAsn
580 585 590
SerGlyAsnGluValTyrIleAspArgIleGluPheValProAlaGlu
595 600 605
ValThrPheGluAlaGluTyrAspLeuGluArgAlaGlnLysAlaVal
610 615 620
AsnGluLeuPheThrSerSerAsnGlnIleGlyLeuLysThrAspVal
625 630 635 640
ThrAspTyrHisIleAspGlnValSerAsnLeuValGluCysLeuSer
645 650 655
AspGluPheCysLeuAspGluLysLysGluLeuSerGluLysValLys
660 665 670
HisAlaAsnAspLeuValMetSerGlyIleTyrPheLysIleGlnThr
675 680 685
LeuGluGlySerIleAspAsn
690 695
(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 969 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
MetAspAsnAsnProAsnIleAsnGluCysIleProTyrAsnCysLeu
1 5 10 15
SerAsnProGluValGluValLeuGlyGlyGluArgIleGluThrGly
20 25 30
TyrThrProIleAspIleSerLeuSerLeuThrGlnPheLeuLeuSer
35 40 45
GluPheValProGlyAlaGlyPheValLeuGlyLeuValAspIleIle
50 55 60
TrpGlyIlePheGlyProSerGlnTrpAspAlaPheLeuValGlnIle
65 70 75 80
GluGlnLeuIleAsnGlnArgIleGluGluPheAlaArgAsnGlnAla
85 90 95
IleSerArgLeuGluGlyLeuSerAsnLeuTyrGlnIleTyrAlaGlu
100 105 110
SerPheArgGluTrpGluAlaAspProThrAsnProAlaLeuArgGlu
115 120 125
GluMetArgIleGlnPheAsnAspMetAsnSerAlaLeuThrThrAla
130 135 140
IleProLeuPheAlaValGlnAsnTyrGlnValProLeuLeuSerVal
145 150 155 160
TyrValGlnAlaAlaAsnLeuHisLeuSerValLeuArgAspValSer
165 170 175
ValPheGlyGlnArgTrpGlyPheAspAlaAlaThrIleAsnSerArg
180 185 190
TyrAsnAspLeuThrArgLeuIleGlyAsnTyrThrAspHisAlaVal
195 200 205
ArgTrpTyrAsnThrGlyLeuGluArgValTrpGlyProAspSerArg
210 215 220
AspTrpIleArgTyrAsnGlnPheArgArgGluLeuThrLeuThrVal
225 230 235 240
LeuAspIleValSerLeuPheProAsnTyrAspSerArgThrTyrPro
245 250 255
IleArgThrValSerGlnLeuThrArgGluIleTyrThrAsnProVal
260 265 270
LeuGluAsnPheAspGlySerPheArgGlySerAlaGlnGlyIleGlu
275 280 285
GlySerIleArgSerProHisLeuMetAspIleLeuAsnSerIleThr
290 295 300
IleTyrThrAspAlaHisArgGlyGluTyrTyrTrpSerGlyHisGln
305 310 315 320
IleMetAlaSerProValGlyPheSerGlyProGluPheThrPhePro
325 330 335
                                                      LeuTyrGlyThrMetGlyAsnAlaAlaProGlnGlnArgIleValAla                               340345350                                                                      GlnLeuGlyGlnGlyValTyrArgThrLeuSerSe rThrLeuTyrArg                              355360365                                                                      ArgProPheAsnIleGlyIleAsnAsnGlnGlnLeuSerValLeuAsp                               370375380                                                                      GlyThrGluPheAlaTyrGlyThrSerSerAsnLeuProSerAlaVal                               385390395400                                                                   TyrArgLysSerGlyThrValAspSerLeuAspGluIle ProProGln                              405410415                                                                      AsnAsnAsnValProProArgGlnGlyPheSerHisArgLeuSerHis                               420425 430                                                                     ValSerMetPheArgSerGlyPheSerAsnSerSerValSerIleIle                               435440445                                                                      ArgAlaProMetPheSerTrpIleHisArgSerAlaGlu PheAsnAsn                              450455460                                                                      IleIleProSerSerGlnIleThrGlnIleProLeuThrLysSerThr                               465470475 480                                                                  AsnLeuGlySerGlyThrSerValValLysGlyProGlyPheThrGly                               485490495                                                                      GlyAspIleLeuArgArgThrSerProGlyGlnIleS erThrLeuArg                              500505510                                                                      ValAsnIleThrAlaProLeuSerGlnArgTyrArgValArgIleArg                               515520 525                                                                     TyrAlaSerThrThrAsnLeuGlnPheHisThrSerIleAspGlyArg  
                             530535540                                                                      ProIleAsnGlnGlyAsnPheSerAlaThrMetSerSerGlySerAs n                              545550555560                                                                   LeuGlnSerGlySerPheArgThrValGlyPheThrThrProPheAsn                               565570 575                                                                     PheSerAsnGlySerSerValPheThrLeuSerAlaHisValPheAsn                               580585590                                                                      SerGlyAsnGluValTyrIleAspArgIleGluPheVal ProAlaGlu                              595600605                                                                      ValThrPheGluAlaGluTyrAspLeuGluArgAlaGlnLysAlaVal                               610615620                                                                       AsnGluLeuPheThrSerSerAsnGlnIleGlyLeuLysThrAspVal                              625630635640                                                                   ThrAspTyrHisIleAspGlnValSerAsnLeuValGluCys LeuSer                              645650655                                                                      AspGluPheCysLeuAspGluLysGlnGluLeuSerGluLysValLys                               660665 670                                                                     HisAlaLysArgLeuSerAspGluArgAsnLeuLeuGlnAspProAsn                               675680685                                                                      PheArgGlyIleAsnArgGlnLeuAspArgGlyTrpArgGlyS erThr                              690695700                                                                      AspIleThrIleGlnGlyGlyAspAspValPheLysGluAsnTyrVal                               70571071572 0                                                                  ThrLeuLeuGlyThrPheAspGluCysTyrProThrTyrLeuTyrGln                               725730735                                                                  
    LysIleAspGluSerLysLeuLysAlaTyrThrArgTyrGl nLeuArg                              740745750                                                                      GlyTyrIleGluAspSerGlnAspLeuGluIleTyrLeuIleArgTyr                               755760765                                                                      AsnAlaLysHisGluThrValAsnValProGlyThrGlySerLeuTrp                               770775780                                                                      ProLeuSerAlaGlnSerProIleGlyLysCysGlyGluProAsnArg                                785790795800                                                                  CysAlaProHisLeuGluTrpAsnProAspLeuAspCysSerCysArg                               805810 815                                                                     AspGlyGluLysCysAlaHisHisSerHisHisPheSerLeuAspIle                               820825830                                                                      AspValGlyCysThrAspLeuAsnGluAspLeuGlyValTrp ValIle                              835840845                                                                      PheLysIleLysThrGlnAspGlyHisAlaArgLeuGlyAsnLeuGlu                               850855860                                                                      Ph eLeuGluGluLysProLeuValGlyGluAlaLeuAlaArgValLys                              865870875880                                                                   ArgAlaGluLysLysTrpArgAspLysArgGluLysLeuGluTrpG lu                              885890895                                                                      ThrAsnIleValTyrLysGluAlaLysGluSerValAspAlaLeuPhe                               90090591 0                                                                     ValAsnSerGlnTyrAspGlnLeuGlnAlaAspThrAsnIleAlaMet                               915920925                                                                      IleHisAlaAlaAspLysArgValHisSerIleArgGluAlaTrpAr g                              930935940            
                                                          AsnHisGlyHisThrCysPheLeuCysGluIleValIleArgSerGln                               945950955960                                                                    PheHisThrThrTyrGluProGluAla                                                   965                                                                        
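For readers who wish to check the listing computationally, the following is a minimal Python sketch, not part of the original disclosure, that splits a run of fused three-letter residue codes into individual residues and converts them to the standard one-letter codes. The names `THREE_TO_ONE`, `split_residues` and `to_one_letter` are illustrative assumptions; the example string is the final row of the listing above (residues 961 to 969).

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def split_residues(fused):
    """Split fused three-letter codes into a list, ignoring stray whitespace."""
    text = "".join(fused.split())
    if len(text) % 3 != 0:
        raise ValueError("length is not a multiple of three")
    residues = [text[i:i + 3] for i in range(0, len(text), 3)]
    unknown = [r for r in residues if r not in THREE_TO_ONE]
    if unknown:
        raise ValueError(f"unrecognised residue codes: {unknown}")
    return residues


def to_one_letter(residues):
    """Convert a list of three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[r] for r in residues)


# Final row of the listing (hypothetical usage example).
last_row = split_residues("PheHisThrThrTyrGluProGluAla")
print(len(last_row), to_one_letter(last_row))
```

Because whitespace is discarded before splitting, the same helper tolerates rows in which a residue code has been broken by a stray space.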

We claim:
 1. Recombinant DNA coding for an insecticidally active form of the Bacillus thuringiensis endotoxin comprising DNA derived from a 6.6-type endotoxin gene of the A20 strain of Bacillus thuringiensis.
 2. Recombinant DNA as claimed in claim 1 wherein the DNA is chromosomally derived.
 3. Recombinant DNA as claimed in claim 1 comprising the first 2910 basepairs (970 amino acid codons) of the N-terminal coding region of a plasmid-derived 6.6-type endotoxin gene from Strain A20.
 4. Recombinant DNA as claimed in claim 3 which is derived from the E. coli strain deposited at the NCIMB under the number 40211.
 5. Recombinant DNA coding for an insecticidally-active form of the Bacillus thuringiensis endotoxin comprising the first 1692 basepairs (564 amino acid codons) of the amino-terminal coding region from a 5.3-type endotoxin gene derived from Strain A20 and a restriction endonuclease Hind III-generated internal fragment of 1131 basepairs (377 amino acid codons) from a 4.5-type endotoxin gene derived from strain A20.
 6. Recombinant DNA as claimed in claim 5 which is derived from the E. coli strain deposited at the NCIMB under the number 40050.
 7. Recombinant DNA as claimed in claim 1 or claim 5 comprising a transcriptional initiation region operative in plants and positioned for transcription of the DNA coding for the Bacillus thuringiensis endotoxin.
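
The basepair and codon counts recited in claims 3 and 5 are related by the rule of three bases per codon. As an informal check, and not as part of the claims, the short Python sketch below confirms that arithmetic. The names `CLAIMED_FRAGMENTS` and `codons_in`, and the descriptive labels, are illustrative assumptions; only the numbers are taken from the claims.

```python
# Fragment lengths (basepairs) and codon counts as recited in claims 3 and 5.
CLAIMED_FRAGMENTS = {
    "6.6-type N-terminal region (claim 3)": (2910, 970),
    "5.3-type amino-terminal region (claim 5)": (1692, 564),
    "4.5-type Hind III internal fragment (claim 5)": (1131, 377),
}


def codons_in(basepairs):
    """Return the number of whole codons in a coding region of this length."""
    if basepairs % 3 != 0:
        raise ValueError("length is not a whole number of codons")
    return basepairs // 3


for label, (basepairs, codons) in CLAIMED_FRAGMENTS.items():
    # Each recited codon count should equal the basepair count divided by three.
    assert codons_in(basepairs) == codons
    print(f"{label}: {basepairs} bp = {codons} codons")
```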